(54) APPARATUS FOR PROVIDING INFORMATION, INFORMATION RECEIVER AND STORAGE
MEDIUM 



(57) The present invention comprises a synchronization
section for performing synchronization of an AV stream
and metadata, and a capsulization section for capsulizing
an AV stream and metadata in metadata units. By
reconfiguring the metadata unit by unit and capsulizing it
with the AV stream, the invention makes partial execution
of metadata possible, and thereby makes it possible to
carry out program distribution for processing a segment
comprising part of an AV stream, to speed up response
times, to reduce the necessary storage capacity, and to
reduce network traffic.
Description

Technical Field 

[0001] The present invention relates to an information 
provision apparatus, information receiving apparatus, 
and storage medium, and relates in particular to an in- 
formation provision apparatus, information receiving ap- 
paratus, and storage medium for video/audio, data, etc., 
operating via broadcast media such as digital broad- 
casting and communication media such as the Internet. 

Background Art 

[0002] In recent years, there has been an active trend
toward digitization of broadcasting, and fusion with
communications has also progressed. In the field of
broadcasting, satellite digital broadcasting has already
started, and it is expected that terrestrial broadcasting
will also be digitized in the future.
[0003] As a result of digitization of broadcast con- 
tent, data broadcasting is also performed in addition to 
conventional video and audio. Also, in the communica- 
tions field, digital content distribution via the Internet has 
begun with music, and Internet broadcasting stations 
that broadcast video have also appeared. 
[0004] Henceforth, it is envisaged that continuous
content media such as video and audio will enter the 
home via various paths (transmission media). Through 
such fusion and digitization of communications and 
broadcasting, it has become possible to offer previously 
unavailable services by means of metadata that de- 
scribes content or relates to content. 
[0005] For example, EPG information is provided together
with audio/video information by interleaving an EPG
(Electronic Program Guide; see "Standard specification for
program arrangement information used in digital
broadcasting, ARIB STD-B10 Version 1.1" or "prETS 300 468
Digital Broadcasting systems for television, sound and
data services; Specification for Service Information (SI)
in Digital Video Broadcasting (DVB) systems"), as used in
CS digital broadcasting, in an audio/video PES (Packetized
Elementary Stream) using an MPEG-2 (Moving Picture
Experts Group phase 2; "ISO/IEC 13818-1 to 3") private
section.

[0006] Also, in BS digital broadcasting, data broad- 
casting using MPEG-2 private PES packets is anticipat- 
ed. Moreover, it is also possible to perform content man- 
agement by inserting metadata that describes content 
in the format of user data in material transmission ("AN- 
SI/SMPTE 291M-1996 Ancillary Data Packet and 
Space Formatting"). 

[0007] A conventional information processing system
will be described below using FIG. 15. FIG. 15 is a block
diagram of a conventional information processing system.

[0008] An information provision node 1501 is provided
with a storage section 1502 in which an AV stream and
metadata for describing the AV stream are stored. Also
provided in the information provision node 1501 is an
information provision section 1504 that multiplexes the
AV stream and metadata stored in the storage section 1502
and generates and outputs a multiplex stream 1503. The
information provision section 1504 transmits the
multiplex stream 1503 to an information usage node 1506
via a network 1505.

[0009] Meanwhile, the information usage node 1506 is
provided with an information usage section 1507 that
extracts an AV stream and metadata from a multiplex
stream and executes processing on them in order to use
them. The information usage node 1506 is also provided
with a storage section 1508 that stores the AV stream and
metadata extracted by the information usage section 1507.
The information usage section 1507 reads the AV stream
and metadata stored in the storage section 1508 in order
to use them.

[0010] Next, the information provision section 1504
will be described using FIG. 16. FIG. 16 is a block
diagram of a conventional information provision section.
[0011] The information provision section 1504 is
provided with an access section 1601 that reads an AV
stream and metadata from the storage section 1502.
The access section 1601 outputs an AV stream 1602
and metadata 1603 to a multiplexing section 1604.
[0012] The multiplexing section 1604 transmits to the
information usage node 1506 a multiplex stream 1503
in which the AV stream 1602 and metadata 1603 are
multiplexed.

[0013] Next, multiplex stream generation processing 
by the multiplexing section 1604 will be described using 
FIG. 17. 

[0014] In the drawing, reference numeral 1503 indicates
the MPEG-2 TS (Transport Stream) PES packet layer, and
shows a multiplex stream. Reference numeral 1701
indicates a video PES packet, reference numeral 1702
indicates an audio PES packet, and reference numeral 1703
indicates a private PES packet. 1603 indicates the
metadata PES packet layer, in which 1704 is a first PES
packet comprising metadata and 1705 is a second PES
packet comprising metadata.

[0015] The multiplexing section 1604 divides the
metadata 1603 to make private PES packets, inserts the
first PES packet 1704 and second PES packet 1705 in
order as appropriate between AV streams consisting of
video PES packets 1701 and audio PES packets 1702,
and obtains a multiplex stream 1503 that is an MPEG-2
TS.

[0016] As conventional metadata is ancillary data for an
AV stream (for example, small amounts of data such as
titles), processing has been performed with metadata
alone. That is to say, it has not been necessary to
provide time synchronization of metadata with an AV
stream. Therefore, since conventional metadata does not
have a configuration that provides for synchronization
with an AV stream, metadata has been packetized into
packets of virtually the same size, which have been
inserted as appropriate between AV streams at virtually
equal intervals.

[0017] The multiplexing section 1604 then sends this 
multiplex stream 1503 to the information usage node 
1506. 

[0018] Next, the information usage section 1507 will 
be described using FIG. 18. FIG. 18 is a block diagram 
of a conventional information usage section. 
[0019] The information usage section 1507 is provided
with an extraction section 1803 that performs separation
and extraction, and output, of an AV stream 1801
and metadata 1802. The extraction section 1803 outputs
the separated and extracted AV stream 1801 and
metadata 1802 to an access section 1804.
[0020] The access section 1804 stores the AV stream
1801 and metadata 1802 input from the extraction section
1803 in a storage section 1508. Also, the access
section 1804 outputs the AV stream 1805 and metadata
1806 read from the storage section 1508 to a display
section 1807. The display section 1807 displays either
or both of the AV stream 1805 and metadata 1806 input
from the access section 1804.

[0021] Next, the processing of the information usage 
section 1507 will be described using FIG. 19. FIG. 19 is 
a processing flowchart of a conventional information us- 
age section. 

[0022] The extraction section 1803 performs metadata
parsing, that is, syntax analysis (ST1901). Then the
processing of the access section 1804 and display
section 1807 is executed (ST1902).
[0023] In this way, a conventional information 
processing system can display a description relating to 
AV information, in addition to AV information, by means 
of the information usage node 1506 by having the infor- 
mation provision node 1501 transmit a multiplex stream 
multiplexing an AV stream and metadata to the informa- 
tion usage node 1506. 

[0024] In recent years, a demand has arisen for vari- 
ous kinds of information to be included in metadata, and 
for metadata to be processed coupled with an AV 
stream, rather than having metadata simply as ancillary 
data for an AV stream. 

[0025] However, in the above-described conventional
information processing system, metadata parsing cannot
be carried out until all the metadata has been acquired.
For example, if metadata begins with <metadata>,
metadata parsing cannot be carried out until the data
</metadata> indicating the end of the metadata arrives.
[0026] For this reason, the metadata processing time 
is closely tied to the AV stream display or processing 
time, and since an AV stream is processed in accord- 
ance with the metadata itself, processing cannot be 
started until all the metadata has been received. There- 
fore, in a conventional information processing system, 
there is a problem in that it is difficult to process an AV 
stream in small units. 
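
The limitation described in paragraphs [0025] and [0026] can be illustrated with a minimal sketch. It assumes, purely for illustration, a small XML metadata document whose inner element name (title) is hypothetical, and it uses Python's standard xml.etree.ElementTree as a stand-in for a conventional whole-document parser; it is not the parser of the conventional system itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata document; only the <metadata> wrapper comes from the text,
# the <title> elements are illustrative.
full = "<metadata><title>News</title><title>Weather</title></metadata>"


def try_parse(received: str):
    """Return the parsed root element, or None if the document is still incomplete."""
    try:
        return ET.fromstring(received)
    except ET.ParseError:
        return None


# Simulate the receiver having only the first half of the metadata so far.
partial = full[: len(full) // 2]

print(try_parse(partial))   # nothing usable until </metadata> has arrived
print(try_parse(full).tag)  # parsing succeeds only on the complete document
```

With a whole-document parser of this kind, none of the titles can be used until the closing tag arrives, which is the behaviour the invention sets out to avoid.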



[0027] Also, metadata is distributed virtually uniformly
in a multiplex stream. As a result, especially when the
data quantity of metadata is large, a large AV stream
quantity must be read by the time all the metadata is
read. Consequently, there are problems relating to
inter-node response time delays and increased network
traffic.

Disclosure of Invention 

[0028] It is a first objective of the present invention to
carry out data and program distribution for processing
a segment comprising part of an AV stream, speeding
up of response times, reduction of the necessary storage
capacity, and reduction of network traffic, by making
possible partial execution of metadata.
[0029] Also, it is a second objective of the present
invention to make processing of a segment comprising
part of an AV stream variable, and perform close
synchronization between metadata and AV stream
processing times, by implementing time synchronization of
metadata and an AV stream.

[0030] Further, it is a third objective of the present
invention to extend the degree of freedom for designing
metadata for processing an AV stream.

[0031] In order to meet the first objective, the present
invention is provided with a synchronization section
which synchronizes a data stream segment with a unit
of metadata corresponding to it, and a capsulization
section which capsulizes a data stream packet and
metadata unit packet after synchronization and generates
a capsulized stream.

[0032] By this means, partial execution of metadata
is made possible by reconfiguring metadata unit by unit
and capsulizing it with the data stream. As a result, it is
possible to carry out data and program distribution for
processing a segment comprising part of a data stream,
speeding up of response times, reduction of the necessary
storage capacity, and reduction of network traffic.

[0033] In order to meet the second objective, the
present invention is provided with an extraction section
which extracts from a capsulized stream a content data
stream and metadata for describing or processing that
content, a synchronization section which synchronizes
the metadata, unitized with respect to segments of the
extracted data stream, unit by unit with the content data
stream, and a processing section which processes the
synchronized metadata unit by unit.

[0034] By this means, it is possible to make processing
for a segment comprising part of a data stream variable,
and perform close synchronization between metadata
and AV stream processing times.
[0035] In order to meet the third objective, the present
invention uses a structured description for metadata and
metadata units, and structured description re-formatting
is performed from metadata to units and from units to
metadata.
[0036] By this means, it is possible to extend the degree
of freedom for designing metadata for processing
a data stream. In addition, it is possible for a structured
description written in XML, etc., to be used directly as
metadata.

Brief Description of Drawings 
[0037] 

FIG.1 is a block diagram of an information processing
system according to Embodiment 1 of the present
invention;
FIG.2 is a block diagram of an information processing
section according to Embodiment 1;
FIG.3A is a drawing showing an AV stream according
to Embodiment 1;
FIG.3B is a drawing showing metadata according
to Embodiment 1;
FIG.4A is a drawing showing DTD of XML of metadata
according to Embodiment 1;
FIG.4B is a drawing showing DTD of XML of an
MPU according to Embodiment 1;
FIG.5A is a drawing showing an instance of XML of
metadata according to Embodiment 1;
FIG.5B is a drawing showing an instance of XML of
an MPU according to Embodiment 1;
FIG.6 is a drawing showing the syntax of metadata
according to Embodiment 1;
FIG.7 is a drawing for explaining the operation of a
capsulization section according to Embodiment 1;
FIG.8 is a block diagram of an information usage
section according to Embodiment 2 of the present
invention;
FIG.9 is a processing flowchart showing the metadata
processing operations of an information usage
node according to Embodiment 2 of the present
invention;
FIG.10 is a block diagram of an information usage
section according to Embodiment 3 of the present
invention;
FIG.11 is a block diagram of an information usage
section according to Embodiment 4 of the present
invention;
FIG.12 is a block diagram of an information
processing system according to Embodiment 5 of
the present invention;
FIG.13 is a block diagram of an information
processing section according to Embodiment 5;
FIG.14 is a block diagram of an information usage
section according to Embodiment 6 of the present
invention;
FIG.15 is a block diagram of a conventional
information processing system;
FIG.16 is a detailed drawing of a conventional
information provision section;
FIG.17 is a drawing showing the configuration of a
conventional multiplex stream;
FIG.18 is a detailed drawing of a conventional
information usage section; and
FIG.19 is a processing flowchart for a conventional
extraction section.

Best Mode for Carrying out the Invention 

[0038] With reference now to the attached drawings,
embodiments of the present invention will be explained
in detail below.

(Embodiment 1) 

[0039] An information processing system according
to Embodiment 1 of the present invention will be
described below. FIG.1 is a block diagram of an
information processing system according to Embodiment 1.
[0040] An information provision node 101 is provided
with a storage section 102 in which an AV stream and
AV stream related metadata are stored. The metadata
is data that describes the related AV stream, or data for
processing the metadata itself, or the like. Also provided
in the information provision node 101 is an information
provision section 104 that multiplexes the AV stream
and metadata stored in the storage section 102 and
generates and outputs a capsulized stream 103. The
information provision section 104 transmits the capsulized
stream 103 via a network 105 to an information usage
node 106, which is an apparatus on the information
receiving side.

[0041] Meanwhile, the information usage node 106 is
provided with an information usage section 107 that
extracts an AV stream and metadata from the capsulized
stream 103 and executes predetermined processing on
them in order to use them. The information usage node
106 is also provided with a storage section 108 that
stores the AV stream and metadata extracted by the
information usage section 107. The information usage
section 107 reads the AV stream and metadata stored
in the storage section 108 in order to use them.

[0042] Next, the information provision section 104 will
be described using FIG.2. FIG.2 is a block diagram of
an information provision section according to
Embodiment 1.

[0043] The information provision section 104 is
provided with an access section 201 that reads an AV
stream and metadata from the storage section 102. The
access section 201 outputs an AV stream 202 and
metadata 203 to a synchronization section 204.

[0044] The synchronization section 204 implements
time synchronization for the AV stream 202 and
metadata 203 read by the access section 201, and outputs
the synchronized AV stream 205 and metadata 206 to
a capsulization section 207.

[0045] The capsulization section 207 capsulizes the
synchronized AV stream 205 and metadata 206, and
transmits them to the information usage node 106 as a
capsulized stream 103.
[0046] Also, the present invention unitizes metadata
to enable metadata to be executed in parts. Then, AV
stream segments and corresponding metadata units are
synchronized, synchronized data stream packets and
metadata unit packets are capsulized, and a capsulized
stream is generated.

[0047] The operation of the information provision sec- 
tion 104 of the present invention will be described in de- 
tail below. 

[0048] First, the AV stream 202 and metadata 203
stored in the storage section 102 will be described using
FIG.3A and FIG.3B.

[0049] The AV stream 202 has video PES packets 301 
and audio PES packets 302 interleaved to form a 
stream. In the present embodiment, a mode is described 
whereby an AV stream 202 is stored in the storage sec- 
tion 102, but a mode is also possible whereby a video 
stream and audio stream are stored. 
[0050] The metadata 203 is configured so as to have 
a plurality of MPUs (Metadata Processing Units) 303. 
[0051] The thus configured metadata 203 and AV
stream 202 are read from the storage section 102 by the
access section 201. Then the access section 201 outputs
the read AV stream 202 and metadata 203 to the
synchronization section 204.

[0052] On receiving the AV stream 202 and metadata 
203, the synchronization section 204 first proceeds to 
processing for unitizing the metadata 203. Here, defini- 
tions of the metadata 203 and MPU 303 will be de- 
scribed using FIG.4A and FIG.4B. FIG.4A and FIG.4B 
are drawings showing DTD of XML. In FIG.4A, 401 is a 
drawing showing a metadata definition (metadata.dtd) 
that defines the metadata 203. In FIG.4B, the drawing 
indicated by reference numeral 402 shows an MPU def- 
inition (mpu.dtd) that defines an MPU 303. 
[0053] The metadata definition 401 defines the meta- 
data 203 as having one or more MPUs 303. For the con- 
tents of an MPU 303, referencing the MPU definition 402 
is defined. 

[0054] The MPU definition 402 defines an MPU 303
as having one or more element_data items. For the
contents of element_data, referencing user_defined.dtd is
defined. Also, the MPU definition 402 defines an MPU
303 as having a serial number (no) assigned.
[0055] In this way, it is possible to include in an MPU 
303 different processing contents for each of various 
services according to user_defined.dtd. Thus, it is pos- 
sible to extend the degree of freedom for designing 
metadata for processing an AV stream. 
[0056] Also, it is possible to include in an MPU 303 
processing contents not in accordance with a transmis- 
sion specification, according to user_defined.dtd. By 
this means, metadata can also be used for a different 
transmission specification, making it possible to provide 
metadata services that support a variety of transmission 
specifications. 

[0057] Next, the unitization of metadata 203 will be
described using FIG.5A and FIG.5B. In FIG.5A, the
drawing indicated by reference numeral 501 shows
metadata (XML instance) whereby metadata 203 is given
a structured description according to metadata
definition 401, and the drawing indicated by reference
numeral 502 shows an MPU (XML instance) whereby an
MPU 303 is given a structured description according to
MPU definition 402.

[0058] As described above, according to metadata
definition 401, metadata 203 is represented by a
collection of MPU definitions 402. The structured
description of metadata 203 according to this metadata
definition 401 is metadata (XML instance) 501. As can
be seen from the drawing, metadata (XML instance)
501 includes a plurality of MPUs 303. Also,
metadata 203 is stored in the storage section 102 as
metadata (XML instance) 501.

[0059] According to MPU definition 402, an MPU 303 
is represented by a collection of metadata defined by 
user_defined.dtd. According to this MPU definition 402, 
what gives a structured description of MPU 303 for each 
MPU is MPU (XML instance) 502. As can be seen from 
the drawing, MPU (XML instance) 502 includes a plu- 
rality of user_defined.dtd items. Also, MPU 303 is stored 
in the storage section 102 as MPU (XML instance) 502. 
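
The nesting described in paragraphs [0053] to [0059] (metadata containing one or more MPUs, each MPU containing one or more element_data items and carrying a serial number) can be sketched as follows. This is only an illustration under assumptions: the element names metadata, mpu and element_data follow the text, but representing the serial number as an attribute named no, and the sample text content, are hypothetical choices, since the DTDs of FIG.4A and FIG.4B are not reproduced here.

```python
import xml.etree.ElementTree as ET


def build_metadata(units):
    """Build a <metadata> tree from a list of units, each a list of element_data texts."""
    root = ET.Element("metadata")
    for no, items in enumerate(units, start=1):
        # Assumption: the serial number "no" is carried as an attribute of <mpu>.
        mpu = ET.SubElement(root, "mpu", {"no": str(no)})
        for text in items:
            ET.SubElement(mpu, "element_data").text = text
    return root


# Two illustrative units; in practice the element_data contents would follow
# whatever user_defined.dtd specifies for the service.
doc = build_metadata([["scene 1 title"], ["scene 2 title", "scene 2 caption"]])
xml_text = ET.tostring(doc, encoding="unicode")
print(xml_text)
```

Because each unit is a self-contained subtree, it can be serialized, sent and interpreted on its own, which is what the later unit-by-unit processing relies on.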
[0060] An MPU 303 has contents <mpu> to </mpu>.
That is to say, if there is information from <mpu> to
</mpu>, the synchronization section 204 can grasp MPU
303 contents and can perform MPU 303 processing. For
this reason, when picking out an MPU 303 from
metadata 203, the synchronization section 204 extracts the
contents on the inside of a tag called an MPU tag (here,
<mpu>) defined by an MPU definition 402.
[0061] By having metadata 203 composed of lower- 
level information MPUs 303 in this way, the synchroni- 
zation section 204 can perform metadata 203 process- 
ing for each MPU 303, and also closely synchronize the 
AV data 202 and metadata 203. 
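
As a rough sketch of picking out MPUs by their <mpu> tag, the following uses the incremental XMLPullParser from Python's standard library to hand over each unit as soon as its closing </mpu> has been received, without waiting for </metadata>. The function name iter_mpus, the no attribute and the sample chunks are assumptions made for illustration; the chunks are split at tag boundaries for simplicity.

```python
import xml.etree.ElementTree as ET


def iter_mpus(chunks):
    """Yield each <mpu> element as soon as its closing tag has been fed to the parser."""
    parser = ET.XMLPullParser(events=("end",))
    for chunk in chunks:
        parser.feed(chunk)
        for _event, elem in parser.read_events():
            if elem.tag == "mpu":
                yield elem


# Hypothetical stream contents, delivered in two pieces.
chunk1 = '<metadata><mpu no="1"><element_data>a</element_data></mpu>'
chunk2 = '<mpu no="2"><element_data>b</element_data></mpu></metadata>'

# Only the first piece has arrived, yet the first unit is already usable.
first = next(iter_mpus([chunk1]))
print(first.get("no"), first.find("element_data").text)

# Once everything has arrived, all units are produced in order.
all_units = [m.get("no") for m in iter_mpus([chunk1, chunk2])]
print(all_units)
```

The design point is that the unit of parsing is the MPU rather than the whole document, so processing latency is bounded by the size of one unit.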

[0062] Next, the synchronization section 204 cap- 
sulizes metadata 203 sent from the access section 201 
using the syntax shown in FIG.6. FIG.6 shows the syn- 
tax of metadata according to Embodiment 1 and Em- 
bodiment 2. 

[0063] In FIG.6, metadata_type 601 is the metadata
type, such as position information, content information,
or program. metadata_subtype 602 is the concrete
metadata type, such as GPS or structured description
(MPEG-7). MPU_length 603 is the data length as a
number of bytes from immediately after the MPU_length
field to the end of the MPU. An MPU is composed of one
or more PES packets, and is the regeneration unit of
metadata divided when a Metadata Elementary Stream
is encoded. media_sync_flag 604 is a flag indicating the
presence or absence of synchronization between the
AV stream and metadata. overwrite_flag 605 is a flag
indicating whether the previous metadata is to be
overwritten. element_data_length 606 is the data byte
length (M) of element_data 609. start_time() 607 is the
start time of a segment that is a part of the AV stream
indicated by the metadata. duration() 608 is the
continuation time of a segment that is part of the AV
stream indicated by the metadata. element_data 609 is
the actual data of the metadata.

[0064] For the syntax shown in FIG.6, coding uses 
syntax 610 from else downward even when the meta- 
data data quantity is small and unitization is not per- 
formed. 
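
To make the role of the fields in FIG.6 concrete, the following is a minimal serialization sketch. The field names and their order follow paragraph [0063], but every field width, the packing of the two flags into one byte, and the use of plain integers for start_time and duration are assumptions made for illustration only; the actual bit layout of FIG.6 is not reproduced in this text.

```python
import struct
from dataclasses import dataclass


@dataclass
class Mpu:
    metadata_type: int      # e.g. position information, content information
    metadata_subtype: int   # e.g. GPS, structured description
    media_sync_flag: bool   # synchronization with the AV stream present or not
    overwrite_flag: bool    # whether previous metadata is to be overwritten
    start_time: int         # assumed: integer time stamp of the segment start
    duration: int           # assumed: integer length of the segment
    element_data: bytes     # actual data of the metadata


# Assumed widths: 8-bit type and subtype, 16-bit MPU_length.
_HEAD = struct.Struct(">BBH")
# Assumed widths: 8-bit flag byte, 16-bit element_data_length, 32-bit times.
_BODY = struct.Struct(">BHII")


def pack_mpu(m: Mpu) -> bytes:
    flags = (m.media_sync_flag << 7) | (m.overwrite_flag << 6)
    body = _BODY.pack(flags, len(m.element_data), m.start_time, m.duration)
    body += m.element_data
    # MPU_length counts the bytes from just after the MPU_length field to the end.
    return _HEAD.pack(m.metadata_type, m.metadata_subtype, len(body)) + body


def unpack_mpu(buf: bytes) -> Mpu:
    mtype, msub, mpu_length = _HEAD.unpack_from(buf, 0)
    body = buf[_HEAD.size:_HEAD.size + mpu_length]
    flags, n, start, dur = _BODY.unpack_from(body, 0)
    data = body[_BODY.size:_BODY.size + n]
    return Mpu(mtype, msub, bool(flags & 0x80), bool(flags & 0x40), start, dur, data)


sample = Mpu(1, 2, True, False, 9000, 4500, b"<mpu/>")
encoded = pack_mpu(sample)
print(unpack_mpu(encoded) == sample)
```

Carrying start_time and duration inside each unit is what lets a receiver relate the unit to an AV stream segment without consulting any other unit.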

[0065] The synchronization section 204 capsulizes 
the AV stream segment for processing specified by the 
first packet's processing start time 607 and duration 
608, and part of the metadata 203 corresponding to the 
segment for processing, as a capsulized stream (private 
PES). 

[0066] When metadata 203 is PES-packetized, an
MPU 303 is packetized together with the AV stream
segment first packet processing start time (start_time),
duration() 608, and actual data of the metadata as an
element (element_data) in the metadata syntax shown in
FIG.6.

[0067] By this means, it is possible for an MPU 303 to 
have information for maintaining synchronization with 
the AV stream 202. Thus, synchronization is maintained 
between the MPU 303 and AV stream 202. In this way, 
metadata 203 operation can be determined on the in- 
formation provision node 101 side. 
[0068] Also, in Embodiment 1, an MPU 303 is composed
of two packets, a first PES packet 701 and a second
PES packet 702, as shown in FIG.7. The operations
whereby the synchronization section 204 packetizes an
MPU 303 into private PES packets and interleaves
these with video PES packets 301 and audio PES
packets 302 in this case will be described using FIG.7. How
many packets an MPU 303 is made into can be
determined arbitrarily according to the MPU 303 size and the
packet size.
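
The dependence of the packet count on the MPU size and the packet size can be shown with a short sketch. The helper names and the payload limit of 184 bytes used in the example are hypothetical; the text does not specify a payload size.

```python
def packets_needed(mpu_size: int, max_payload: int) -> int:
    """Number of packets required for an MPU of mpu_size bytes (ceiling division)."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    return -(-mpu_size // max_payload)


def split_into_pes_payloads(mpu_bytes: bytes, max_payload: int) -> list:
    """Cut one serialized MPU into consecutive payloads of at most max_payload bytes."""
    if max_payload <= 0:
        raise ValueError("max_payload must be positive")
    return [mpu_bytes[i:i + max_payload]
            for i in range(0, len(mpu_bytes), max_payload)]


# A 300-byte MPU with an assumed 184-byte payload limit yields two packets,
# matching the two-packet case of Embodiment 1.
parts = split_into_pes_payloads(b"x" * 300, 184)
print([len(p) for p in parts])
```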

[0069] In the case of Embodiment 1, the first PES
packet 701 and second PES packet 702 are placed as
private PES packets 708 earlier in time than the first
packet 703, so that the first PES packet 701 and second
PES packet 702 are processed before the processing
start time (start_time) 705 of the first packet of the
corresponding AV stream segment.
[0070] Also, the difference Δt 706 between the arrival
time t 704 of the second PES packet 702 and the
processing start time (start_time) 705 of the corresponding
first packet 703 is made long enough for the information
usage section 107, which is on the information receiving
side, to generate an MPU 303 from the first PES packet
701 and second PES packet 702, and execute processing
based on the contents of the generated MPU 303.
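
The placement rule of paragraphs [0069] and [0070] can be sketched as a simple scheduling step: the private PES packets of each MPU are given a send time that precedes the start_time of the corresponding segment by a lead time Δt. The function name, the abstract time units, the packet labels, and the tie-breaking rule (metadata before AV packets at the same instant) are all assumptions for illustration.

```python
def interleave(av_packets, mpus, lead_time):
    """Merge AV packets and MPU packets into one ordered list of labels.

    av_packets: list of (time, label) for video/audio PES packets.
    mpus:       list of (start_time, [labels]) for the private PES packets of each MPU.
    lead_time:  assumed value of delta-t, chosen so the receiver can rebuild and
                execute the MPU before start_time.
    """
    events = [(t, 1, label) for t, label in av_packets]
    for start_time, packets in mpus:
        send_time = start_time - lead_time
        # Priority 0 places metadata ahead of AV packets sharing the same time.
        events += [(send_time, 0, p) for p in packets]
    events.sort(key=lambda e: (e[0], e[1]))  # stable sort keeps packet order per MPU
    return [label for _t, _prio, label in events]


av = [(0, "V0"), (1, "A0"), (2, "V1"), (3, "A1")]
# One MPU in two packets, describing the segment that starts with packet V1 at time 2.
stream = interleave(av, [(2, ["M1a", "M1b"])], lead_time=1)
print(stream)
```

In this sketch both metadata packets precede V1, the first packet of the segment they describe, which is the ordering constraint the paragraph states.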
[0071] Then, the AV stream 205 and metadata 206 
synchronized by the synchronization section 204 in this 
way are input to the capsulization section 207. 
[0072] The capsulization section 207 capsulizes the 
input AV stream 205 and metadata 206, and transmits 
them as a capsulized stream 103. 
[0073] As described above, according to Embodiment
1, metadata can be re-formatted unit by unit and
capsulized with an AV stream by providing a
synchronization section 204 that maintains synchronization
of the AV stream and metadata, and a capsulization section
207 that capsulizes metadata unit by unit with the AV
stream. By this means, it becomes possible to perform
partial execution of metadata, and to carry out program
distribution for processing a segment comprising part of
an AV stream, speeding up of response times, reduction
of the necessary storage capacity, and reduction of
network traffic.

[0074] Moreover, according to Embodiment 1, by using
a structured description written using XML for metadata
and metadata units, and performing structured
description re-formatting from metadata to units and from
units to metadata, it is possible to provide extensibility for
metadata for processing an AV stream, and extend the
degree of freedom for designing metadata. In addition,
it is possible for a structured description written in XML,
etc., to be used directly as metadata.

(Embodiment 2) 

[0075] Next, an information processing system
according to Embodiment 2 of the present invention will
be described. FIG.8 is a block diagram of an information
usage section 107 according to Embodiment 2.
[0076] The information usage section 107 is provided
with an extraction section 803 that performs separation
and extraction, and output, of an AV stream 801 and
metadata 802. The extraction section 803 outputs the
extracted AV stream 801 and metadata 802 to an access
section 804.

[0077] The access section 804 records the AV stream
801 and metadata 802 in a storage section 108. Also,
the access section 804 reads an AV stream 805 and
metadata 806 stored in the storage section 108, and
outputs them to a synchronization section 807.
[0078] The synchronization section 807 performs 
time synchronization every MPU 303 for the AV stream 
805 and metadata 806 read by the access section 804, 
and outputs them to a core processing section 808. 
[0079] The core processing section 808 is provided 
with a display section 809. The display section 809 per- 
forms time synchronization and display of the input syn- 
chronized AV stream 810 and metadata 811. 
[0080] In this way, the information usage section 107
extracts an AV stream 801 and metadata 802 from the 
capsulized stream 103 in the extraction section 803. 
Then, in the synchronization section 807, the corre- 
sponding metadata 802 unitized in accordance with AV 
stream 801 segments is synchronized with the AV 
stream 801 unit by unit. Then the synchronized meta- 
data 811 and AV stream 810 are displayed unit by unit 
by the display section 809. 

[0081] Next, the metadata processing operations of
the information usage node 106 will be described in
detail using the flowchart in FIG.9. First, the extraction
section 803 extracts an AV stream and metadata from the
received capsulized stream 103. In addition, the
information usage section 107 performs MPU 303 parsing
(ST901). Next, in the information usage section 107, a
check is performed as to whether the MPUs 303 are to
be merged and re-formatted as metadata 802 (ST902).
Then, in the information usage section 107, a check is
performed as to whether MPU 303 execution is to be
performed unit by unit (ST903).

[0082] If, in ST902 and ST903, the results confirmed 
by the information usage section 107 are MPU merging 
and MPU execution, processing is executed by the core 
processing section 808 (ST904). Then MPU merging is 
performed in the information usage section 107 
(ST905). In Embodiment 2, this processing is display 
processing, but it may also be conversion processing or 
transfer processing as in other embodiments to be de- 
scribed hereafter. 

[0083] Then, in the information usage section 107,
it is judged whether an MPU time or number limit, that
is, an event that indicates an MPU processing unit, has
occurred (ST906), and ST904 and ST905 are repeated
until such an event occurs. Event information is given
to software when providing universality, or is given to
a terminal beforehand when the system is used in
a fixed mode.

[0084] Then, in the information usage section 107,
rendering (that is to say, formatting) of the metadata is
performed from the MPUs collected together in ST906.
Metadata formatted on the basis of this event is stored
in the storage section 108. Then the core processing
section 808 reads this formatted data and performs
various kinds of processing.

[0085] In this way, it is possible not only to perform 
processing for each MPU, which is the minimum unit of 
processing, in ST904, but also to perform processing 
based on data obtained by merging MPUs according to 
an event. 

[0086] By this means, it is possible to set arbitrarily a 
unit for MPU processing according to an event, and 
therefore the length of AV data segments for metadata 
processing can be made variable. That is to say, it is 
possible to process metadata for small AV data and to 
process metadata for huge AV data. For example, it is 
possible to update metadata display in short cycles in a 
case such as a vehicle navigation system, and update 
metadata in long cycles in a case such as a news pro- 
gram. 

[0087] Also, by storing this metadata that has been 
formatted on the basis of an event in the storage section 
108, it is possible to read and process this information 
by means of user operations. 

[0088] If, in ST902 and ST903, the results confirmed
by the information usage section 107 are MPU merging
and MPU non-execution, an MPU merge is performed
(ST908). Then, in the information usage section 107,
it is judged whether an MPU time or number limit, that
is, an event related to completion of an MPU merge, has
occurred (ST909), and ST908 is repeated until such an
event occurs. Then, in the information usage section
107, rendering (that is to say, formatting) of the
metadata is performed from the MPUs collected together
(ST910). Metadata formatted on the basis of this event
is stored in the storage section 108. Then the core
processing section 808 reads this formatted data and
performs various kinds of processing.

[0089] In this way, it is possible not only to perform 
processing for each MPU, which is the minimum unit of 
processing, but also to perform processing based on da- 
ta obtained by merging MPUs according to an event. 

[0090] If, in ST902 and ST903, the results confirmed
by the information usage section 107 are MPU non-merging
and MPU execution, processing is executed
sequentially (ST911). Then, in the information usage
section 107, judgment as to the presence of an MPU
time or number limit (that is, an event that indicates an
MPU processing unit) is performed (ST912), and ST911
is repeated until the occurrence of an event.
[0091] In this way, it is possible to perform processing
for each MPU, which is the minimum unit of processing,
and not to perform processing based on data obtained
by merging MPUs according to an event.
[0092] If, in ST902 and ST903, the results confirmed
by the information usage section 107 are MPU non-merging
and MPU non-execution, no particular MPU-related
processing is performed.

[0093] As described above, the extraction method 
can be changed as appropriate according to the con- 
tents contained in MPUs 303. 
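
The four combinations of merging and execution confirmed in ST902 and ST903 can be summarized in a short Python sketch. The function `handle_mpus` and its parameters are hypothetical; event-based batching is left out for brevity, so the whole input list is treated as one batch, and simple string concatenation stands in for formatting.

```python
from typing import Callable, List


def handle_mpus(mpus: List[str], merge: bool, execute: bool,
                run: Callable[[str], None]) -> List[str]:
    """Return the formatted (merged) metadata blocks that would be stored."""
    stored: List[str] = []
    if execute:
        for mpu in mpus:               # per-unit processing, one MPU at a time
            run(mpu)
    if merge:
        stored.append("".join(mpus))   # merged units, formatted as a single block
    return stored                      # empty when neither flag asks for merging


executed: List[str] = []
merged = handle_mpus(["<mpu>a</mpu>", "<mpu>b</mpu>"], True, True, executed.append)
```

With both flags false the function does nothing, which matches the case in which no particular MPU-related processing is performed.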

[0094] The operation of the information usage section
107 will now be described. In the information usage
section 107, the extraction section 803 extracts an AV
stream 801 and metadata 802 from the input capsulized
stream 103, and outputs them to the access section
804. After recording the AV stream 801 and metadata
802 in the storage section 108, the access section 804
reads an AV stream 805 and metadata 806, and outputs
them to the synchronization section 807. The
synchronization section 807 performs time synchronization
every MPU 303 for the AV stream 805 and metadata 806
read by the access section 804, and outputs them to the
core processing section 808. In the core processing
section 808, the display section 809 performs time
synchronization and display of the input AV stream 810 and
metadata 811.
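
As an illustration of time synchronization performed every MPU, the following Python sketch pairs each metadata unit with the AV packets of its segment. The classes `AvPacket` and `Mpu`, the field names, and the function `synchronize` are assumptions introduced for this example and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class AvPacket:
    pts: float        # presentation time in seconds (hypothetical field)
    payload: bytes


@dataclass(frozen=True)
class Mpu:
    start: float      # processing start time of the corresponding segment
    duration: float   # length of that segment in seconds
    body: str


def synchronize(av: List[AvPacket],
                mpus: List[Mpu]) -> List[Tuple[Mpu, List[AvPacket]]]:
    """Pair each MPU with the AV packets whose time lies in [start, start + duration)."""
    return [
        (mpu, [p for p in av if mpu.start <= p.pts < mpu.start + mpu.duration])
        for mpu in mpus
    ]


stream = [AvPacket(float(t), b"") for t in range(4)]
pairs = synchronize(stream, [Mpu(0.0, 2.0, "intro"), Mpu(2.0, 2.0, "news")])
```

Each resulting pair is what a display section could consume as one unit: the metadata for a segment together with the AV data of that segment.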

[0095] As described above, according to Embodiment
2, close synchronization of the metadata and AV stream
processing time can be performed by providing an
extraction section 803 for separating and extracting an AV
stream and metadata, an access section 804 for reading
and writing an AV stream and metadata in a storage
section 108, a synchronization section 807 for performing
synchronization of the read AV stream and metadata
processing, and a display section 809, which is a core
processing section 808. By this means, it is possible to
vary processing for a segment, which is part of an AV
stream.

[0096] Also, information relating to the display method 
used by the display section 809 of the core processing 
section 808 can be provided as metadata. Information 
relating to the display method includes position informa- 
tion for displaying metadata related information, display 
size information, and display update information. 
[0097] By this means, an appropriate method for
displaying metadata can be sent by the information
provision node 101 to the information usage node 106. As a
result, metadata can be displayed appropriately by the
information usage node 106. Therefore, if metadata is
an advertisement or the like, it is possible to make a
specification that allows the advertisement to be
displayed at the desired time, and if metadata is information
related to program descriptions, it is possible to display
the descriptive information so as not to interfere with
images.

[0098] Moreover, according to Embodiment 2, by us- 
ing a structured description written using XML for meta- 
data and metadata units, and performing structured de- 
scription re-format from metadata to units and from units 
to metadata, it is possible to extend the degree of free- 
dom for designing metadata for processing an AV 
stream, and a structured description written in XML, etc., 
can be used directly as metadata.

(Embodiment 3) 

[0099] Next, an information processing method ac- 
cording to Embodiment 3 of the present invention will 
be described. FIG. 10 is a block diagram of an informa- 
tion usage section 1001 according to Embodiment 3. 
Parts identical to those that have already been de- 
scribed are assigned the same reference numerals, and 
a description of these parts is omitted. 
[0100] The information usage section 1001 according
to Embodiment 3 has the core processing section 808
of the information usage section 107 according to
Embodiment 2 replaced by a core processing section 1002.
Below, the information usage section 1001 will be
described centering on the core processing section 1002.
[0101] The core processing section 1002 is provided 
with a transfer section 1003 and a capsulization section 
1006. 

[0102] The transfer section 1003 performs settings, 
such as a destination setting, for transferring an AV 
stream 810 and metadata 811 input from the synchro- 
nization section 807 to another information usage node. 
The transfer section 1003 performs time synchroniza- 
tion every MPU 303, and outputs an AV stream 1004 
and metadata 1005 to the capsulization section 1006. 
[0103] The capsulization section 1006 recapsulizes
the input AV stream 1004 and metadata 1005 and
transmits them to another node as a capsulized stream 1007.
Since the capsulization section 1006 recapsulizes the
AV stream 1004 and metadata 1005 in this way, load
sharing can be performed while maintaining close
synchronization between the metadata and AV stream
processing times.
[0104] The operation of the capsulization section
1006 is similar to that of the capsulization section 207
according to Embodiment 1, and so a detailed
description will be omitted here.

[0105] The operation of the information usage section
1001 will now be described. In the information usage
section 1001, the extraction section 803 extracts an AV
stream 801 and metadata 802 from the input capsulized
stream 103, and outputs them to the access
section 804. After recording the AV stream 801 and
metadata 802 in the storage section 108, the access
section 804 reads an AV stream 805 and metadata 806,
and outputs them to the synchronization section 807.
[0106] The synchronization section 807 performs
time synchronization every MPU 303 for the AV stream
805 and metadata 806 read by the access section 804,
and outputs them to the core processing section 1002.
In the core processing section 1002, the transfer section
1003 performs settings for transferring the input AV
stream 810 and metadata 811 to another information
usage node, and performs time synchronization and
output to the capsulization section 1006 every MPU 303.
The capsulization section 1006 recapsulizes the input
AV stream 1004 and metadata 1005 and transmits them
to another node as a capsulized stream 1007.

[0107] By configuring the information usage section
1001 as described above, it is possible for the transfer
section 1003 to perform settings for transferring the AV
stream 810 and metadata 811 input from the
synchronization section 807 to another information usage node
and to perform time synchronization and output to the
capsulization section 1006 every MPU 303, and for the
capsulization section 1006 to recapsulize the AV stream
1004 and metadata 1005 input from the transfer section
1003 and transmit them to another node as a capsulized
stream 1007.

[0108] As described above, according to Embodiment
3, it is possible for load sharing to be performed while
maintaining close synchronization between the metadata
and AV stream processing times, and also to make
processing for a segment comprising part of a data
stream variable, by providing in the information usage
section 1001 an extraction section 803 for separating
and extracting an AV stream and metadata, an access
section 804 for reading and writing an AV stream and
metadata in a storage section 108, a synchronization
section 807 for performing synchronization of the read
AV stream and metadata processing, and, in the core
processing section 1002, a transfer section 1003 and a
capsulization section 1006.

[0109] Moreover, according to Embodiment 3, it is
also possible for information about the processing
methods of the transfer section 1003 and capsulization
section 1006, or a processing program itself, to be made
metadata. Processing method here refers to processing
for changing the place where metadata is inserted
according to the transfer destination, for instance. By this
means, it is possible for the information provision node
101 to send appropriate information for transferring and
capsulizing metadata to the information usage node
106. As a result, it is possible for metadata to be
transferred and capsulized appropriately by the information
usage node 106.

(Embodiment 4) 

[0110] Next, an information processing system ac- 
cording to Embodiment 4 of the present invention will 
be described. FIG. 11 is a block diagram of an informa- 
tion usage section 1101 according to Embodiment 4. 
Parts identical to those that have already been de- 
scribed are assigned the same reference numerals, and 
a description of these parts is omitted. 
[0111] The information usage section 1101 according
to Embodiment 4 is equivalent to the information usage
section 107 according to Embodiment 2 or the
information usage section 1001 according to Embodiment 3
provided with a conversion section 1102. Below, the
information usage section 1101 will be described
centering on the conversion section 1102.
[0112] The conversion section 1102 converts an AV
stream 810 in accordance with metadata 811, and
outputs the result to the core processing section 1105 as a
T-AV stream 1103 and T-metadata 1104. The
conversion referred to here is color conversion according to the
transmission destination terminal or display position,
graphic information format conversion according to the
transmission destination terminal or display position, or
conversion of the voice format to an MP3 or portable
phone format according to the transmission destination
terminal.
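
The idea of converting an AV segment in accordance with its metadata can be sketched in Python as a lookup from destination terminal type to conversion function. This is only an illustrative sketch: the key `"destination"`, the terminal names, and the channel-averaging conversion are assumptions made here and are not defined by the specification.

```python
from typing import Callable, Dict, List, Tuple

Pixel = Tuple[int, int, int]


def to_gray(pixel: Pixel) -> Pixel:
    """Average the channels; a deliberately simple stand-in for color conversion."""
    level = sum(pixel) // 3
    return (level, level, level)


def identity(pixel: Pixel) -> Pixel:
    """Leave the pixel unchanged."""
    return pixel


# Hypothetical mapping from destination terminal type to a conversion.
CONVERTERS: Dict[str, Callable[[Pixel], Pixel]] = {
    "monochrome_terminal": to_gray,
    "color_terminal": identity,
}


def convert_segment(pixels: List[Pixel], metadata: Dict[str, str]) -> List[Pixel]:
    """Convert one AV segment as directed by the metadata unit that accompanies it."""
    convert = CONVERTERS.get(metadata.get("destination", ""), identity)
    return [convert(p) for p in pixels]


converted = convert_segment([(30, 60, 90)], {"destination": "monochrome_terminal"})
```

Because the conversion is selected per metadata unit, different segments of the same stream can be converted differently.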

[0113] The core processing section 1105 operates in
the same way as either the core processing section 808
shown in Embodiment 2 or the core processing section
1002 shown in Embodiment 3.

[0114] If the core processing section 1105 is core
processing section 808, the core processing section
1105 is provided with a display section 809. In this case
the display section 809 performs display while carrying
out time synchronization of the input T-AV stream 1103
and T-metadata 1104.

[0115] If the core processing section 1105 is core
processing section 1002, the core processing section
1105 is provided with a transfer section 1003 and
capsulization section 1006. In this case, the transfer section
1003 performs settings for transferring the input T-AV
stream 1103 and T-metadata 1104 to another information
usage node, and performs time synchronization and
output to the capsulization section 1006 every MPU 303.
The operation of the capsulization section 1006
according to Embodiment 3 is similar to that of the
capsulization section 207 of Embodiment 1.



[0116] The operation of the information usage section
1101 will now be described. In the information usage
section 1101, the extraction section 803 extracts an AV
stream 801 and metadata 802 from the input capsulized
stream 103, and outputs them to the access
section 804. After recording the AV stream 801 and
metadata 802 in the storage section 108, the access
section 804 reads an AV stream 805 and metadata 806,
and outputs them to the synchronization section 807.
The synchronization section 807 performs time
synchronization every MPU 303 for the AV stream 805 and
metadata 806 read by the access section 804, and
outputs them to the conversion section 1102. The
conversion section 1102 then converts AV stream 810
according to metadata 811, and outputs the results to the core
processing section 1105 as a T-AV stream 1103 and
T-metadata 1104.

[0117] Then, if the core processing section 1105 is the
core processing section 808 according to Embodiment
2, the display section 809 performs display while
carrying out time synchronization of the input T-AV stream
1103 and T-metadata 1104. If the core processing
section 1105 is the core processing section 1002 according
to Embodiment 3, the transfer section 1003 performs
settings for transferring the input T-AV stream 1103 and
T-metadata 1104 to another information usage node,
and performs time synchronization and output to the
capsulization section 1006 every MPU 303. The
capsulization section 1006 recapsulizes the input T-AV
stream 1103 and T-metadata 1104, and transmits them
as a capsulized stream 1007.

[0118] As described above, according to Embodiment
4, it is possible for the place where conversion
processing is performed according to metadata to be made
variable by having the information usage section 1101
provided with an extraction section 803 for separating and
extracting an AV stream and metadata, an access
section 804 for reading and writing an AV stream and
metadata in a storage section 108, a synchronization section
807 for performing synchronization of the read AV
stream and metadata processing, and, as the core
processing section 1105, a usage program composed
of a display section 809 or a transfer section 1003 and
capsulization section 1006. The place where conversion
processing is performed may be, for example, a server,
terminal, network node (gateway), or the like.
[0119] Moreover, according to Embodiment 4, it is
possible to make processing for a segment comprising
part of an AV stream variable. Also, AV stream and
metadata conversion can be made possible.
[0120] Furthermore, according to Embodiment 4,
performing further processing on a converted AV stream
and metadata can be made possible.
[0121] Still further, according to Embodiment 4, by
using a structured description written using XML for
metadata and metadata units, and performing structured
description re-format from metadata to units and from units
to metadata, it is possible to extend the degree of
freedom for designing metadata for processing an AV
stream, and a structured description written in XML, etc.,
can be used directly as metadata.
[0122] In addition, according to Embodiment 4, it is
possible for information relating to methods for
processing metadata in the core processing section 1105 (the
display method, transfer method, and capsulization
method) to be made metadata.

(Embodiment 5) 

[0123] Next, an information processing system ac- 
cording to Embodiment 5 of the present invention will 
be described. FIG. 12 is a block diagram of an informa- 
tion processing system according to Embodiment 5. 
Parts that have already been described are assigned 
the same reference numerals. 

[0124] Embodiment 5 has a configuration that omits
the processing for synchronizing an AV stream and
metadata from the information provision section 104
according to Embodiment 1. When synchronization of an AV
stream and metadata is not necessary, omitting
synchronization processing in this way increases processing
speed and simplifies the configuration.
Examples of cases where synchronization of an AV stream
and metadata need not be performed include cases
where metadata is sent all together as with header
information and processing need only be performed unit
by unit, where it is sufficient for metadata to be
synchronized implicitly with the AV stream, where it is sufficient
for predetermined control to be performed by the
terminal on the information usage side, and where metadata
need not be processed in real time.
[0125] The configuration of an information processing
system according to Embodiment 5 will now be
described.

[0126] An information provision node 1201 is provided
with a storage section 102 in which an AV stream and
AV stream related metadata are stored. The metadata
is data that describes the related AV stream, or data for
processing the metadata itself, or the like. Also provided
in the information provision node 1201 is an information
provision section 1204 that capsulizes the AV stream
and metadata stored in the storage section 102 and
generates and outputs a capsulized stream 1203. The
information provision section 1204 transmits the
capsulized stream 1203 via a network 105 to an information
usage node 1206, which is an apparatus on the
information receiving side.

[0127] Meanwhile, the information usage node 1206
is provided with an information usage section 1207 that
extracts an AV stream and metadata from the
capsulized stream 1203 and executes predetermined
processing on them in order to use them. The
information usage node 1206 is also provided with a storage
section 108 that stores the AV stream and metadata
extracted by the information usage section 1207. The
information usage section 1207 reads the AV stream and
metadata stored in the storage section 108 in order to
use them.

[0128] Next, the information provision section 1204
will be described using FIG. 13. FIG. 13 is a block
diagram of an information provision section according to
Embodiment 5.

[0129] The information provision section 1204 is
provided with an access section 1301 that reads an AV
stream and metadata from the storage section 102. The
access section 1301 outputs an AV stream 1302 and
metadata 1303 to a unitization section 1304.
[0130] The unitization section 1304 reforms metadata
1303 read by the access section 1301 into MPUs 303,
and outputs the resulting AV stream 1305 and
metadata 1306 to a capsulization section 1307.

[0131] The capsulization section 1307 capsulizes the
input AV stream 1305 and metadata 1306, and transmits
them to the information usage node 1206 as a
capsulized stream 1203.

[0132] In Embodiment 5, as in Embodiment 1,
metadata is unitized to enable it to be executed in parts. Then,
the AV stream and metadata units are packetized, data
stream packets and metadata unit packets are
capsulized, and a capsulized stream is generated.
[0133] The operation of the information provision
section 1204 of the present invention will be described in
detail below. Details of the AV stream 1302 and
metadata 1303 stored in the storage section 102 are the
same as for the AV stream 202 and metadata 203
according to Embodiment 1, so a description of these will
be omitted here.

[0134] With the above-described configuration,
metadata 1303 and an AV stream 1302 are read from the
storage section 102 by the access section 1301. Then
the access section 1301 outputs the read AV stream
1302 and metadata 1303 to the unitization section 1304.

[0135] On receiving the AV stream 1302 and
metadata 1303, the unitization section 1304 first proceeds to
processing for unitizing the metadata 1303.
[0136] Definitions of the metadata 1303 and MPUs
303 are the same as for the metadata 203 according to
Embodiment 1 and the MPUs 303 described in
Embodiment 1, so a description of these will be omitted here.
Also, the process of unitization of the metadata 1303 is
the same as for unitization of the metadata 203
according to Embodiment 1, so a description of this will be
omitted here.

[0137] According to metadata definition 401 shown in
FIG. 4A, metadata 1303 is represented by a collection
of MPU definitions 402. Therefore, metadata 1303 is
given a structured description by means of metadata
definition 401, and is stored in the storage section 102
as metadata (XML instance) 501 shown in FIG. 5A.
[0138] Also, according to MPU definition 402 shown
in FIG. 4B, an MPU 303 is represented by a collection
of metadata defined by user_defined.dtd. Therefore,
MPUs 303 are given a structured description for each
MPU by means of MPU definitions 402, and are stored
in the storage section 102 as MPU (XML instance) 502
shown in FIG. 5B.

[0139] An MPU 303 has contents <mpu> to </mpu>.
That is to say, if there is information from <mpu> to
</mpu>, the unitization section 1304 can grasp MPU 303
contents and can perform MPU 303 processing. For this
reason, when picking out an MPU 303 from metadata
1303, the unitization section 1304 extracts the contents
on the inside of a tag called an MPU tag (here, <mpu>)
defined by an MPU definition 402.
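
Picking out MPUs by their tag can be illustrated with Python's standard `xml.etree.ElementTree` module. This is a minimal sketch under stated assumptions: the sample document, the root element name `metadata`, and the child element `title` are hypothetical, since the definitions of FIG. 4A and FIG. 4B are not reproduced here; only the `<mpu>` tag comes from the description.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical metadata instance; only the <mpu> tag is taken from the text.
SAMPLE = """<metadata>
  <mpu><title>Opening</title></mpu>
  <mpu><title>Headline</title></mpu>
</metadata>"""


def extract_mpus(xml_text: str, tag: str = "mpu") -> List[str]:
    """Return each <mpu>...</mpu> element serialized as its own string."""
    root = ET.fromstring(xml_text)
    # strip() drops the whitespace that follows each element in the source text
    return [ET.tostring(el, encoding="unicode").strip() for el in root.iter(tag)]


mpus = extract_mpus(SAMPLE)
```

Each returned string is self-contained, so it can be processed or packetized independently of the rest of the metadata.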
[0140] By having metadata 1303 composed of
lower-level information MPUs 303 in this way, the unitization
section 1304 can perform metadata 1303 processing for
each MPU 303. By this means, the unitization section
1304 can process AV data 1302 and metadata 1303 unit
by unit.

[0141] Next, as in Embodiment 1, the capsulization 
section 1307 capsulizes metadata 1306 sent from the 
unitization section 1304 using the syntax shown in FIG. 
6. 

[0142] The capsulization section 1307 then capsulizes
the AV stream segment for processing specified by
the first packet's processing start time 607 and duration
608, and part of the metadata 1303 corresponding to
the segment for processing, as a capsulized stream
(private PES).

[0143] The unitization section 1304 then packetizes
MPUs 303 into private PES packets and interleaves
these with video PES packets and audio PES packets.
[0144] Then the capsulization section 1307 capsulizes
the input AV stream 1305 and metadata 1306, and
transmits them as a capsulized stream 1203.
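
The interleaving of metadata unit packets with video and audio packets can be sketched as follows. This Python example does not build real MPEG-2 PES packets; the `Packet` class and its fields are hypothetical stand-ins, and placing each metadata packet immediately ahead of the AV packets that share its start time is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Packet:
    kind: str              # "video", "audio" or "private" (metadata unit)
    time: float            # AV: presentation time; metadata: segment start time
    duration: float = 0.0  # metadata only: length of the segment it describes
    payload: str = ""


def interleave(av: List[Packet], meta: List[Packet]) -> List[Packet]:
    """Merge by time, putting a metadata packet ahead of AV packets at the same time."""
    order = {"private": 0, "video": 1, "audio": 2}
    return sorted(av + meta, key=lambda p: (p.time, order[p.kind]))


av_packets = [Packet("video", 0.0), Packet("audio", 0.0),
              Packet("video", 1.0), Packet("audio", 1.0)]
meta_packets = [Packet("private", 0.0, 1.0, "<mpu>a</mpu>"),
                Packet("private", 1.0, 1.0, "<mpu>b</mpu>")]
capsulized = interleave(av_packets, meta_packets)
```

A receiver reading `capsulized` in order meets each metadata unit before the AV packets of the segment it describes.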
[0145] As described above, according to Embodiment
5, metadata can be re-formatted unit by unit and
capsulized with an AV stream by providing a unitization
section 1304 that unitizes the AV stream and metadata, and
a capsulization section 1307 that capsulizes the
metadata unit by unit with the AV stream. By this means, it
becomes possible to perform partial execution of
metadata, and to carry out program distribution for
processing a segment comprising part of an AV stream,
speeding up of response times, reduction of the necessary
storage capacity, and reduction of network traffic.
[0146] Moreover, since Embodiment 5, unlike Em- 
bodiment 1, omits synchronization processing, when 
synchronization of an AV stream and metadata is not 
necessary, processing speed can be increased by omit- 
ting synchronization processing and the configuration 
can be simplified. 

(Embodiment 6) 

[0147] Next, an information processing system
according to Embodiment 6 of the present invention will
be described. FIG. 14 is a block diagram of an
information usage section 1207 according to Embodiment 6.
[0148] Embodiment 6 has a configuration that omits
the processing for synchronizing an AV stream and
metadata from the information usage section 107
according to Embodiment 2. When synchronization of an AV
stream and metadata is not necessary, omitting
synchronization processing in this way increases processing
speed and simplifies the configuration.
Examples of cases where synchronization of an AV stream
and metadata need not be performed include cases
where metadata is sent all together as with header
information and processing need only be performed unit
by unit, where it is sufficient for metadata to be
synchronized implicitly with the AV stream, where it is sufficient
for predetermined control to be performed by the
terminal on the information usage side, and where metadata
need not be processed in real time.
[0149] The configuration of an information processing 
system according to Embodiment 6 will now be de- 
scribed below. 

[0150] An information usage section 1207 is provided
with an extraction section 1403 that extracts and outputs
an AV stream 1401 and metadata 1402 from an input
capsulized stream 1203. The extraction section 1403
outputs the extracted AV stream 1401 and metadata
1402 to an access section 1404.

[0151] The access section 1404 records the AV
stream 1401 and metadata 1402 in a storage section
108. Also, the access section 1404 reads an AV stream
1405 and metadata 1406 stored in the storage section
108, and outputs them to a core processing section
1407.

[0152] The core processing section 1407 operates in
the same way as the core processing section 808 shown
in Embodiment 2. If the core processing section 1407 is
core processing section 808, the core processing
section 1407 is provided with a display section 1408. In this
case the display section 1408 displays the input AV
stream 1405 and metadata 1406.
[0153] In this way, the information usage section 1207
extracts an AV stream 1401 and metadata 1402 from
the capsulized stream 1203 in the extraction section
1403. Then, the display section 1408 displays metadata
1406 and AV stream 1405 unit by unit.
[0154] The operation of the information usage section
1207 will now be described. In the information usage
section 1207, the extraction section 1403 extracts an AV
stream 1401 and metadata 1402 from the input
capsulized stream 1203, and outputs them to the access
section 1404. After recording the AV stream 1401 and
metadata 1402 in the storage section 108, the access
section 1404 reads an AV stream 1405 and metadata
1406, and outputs them to the core processing section
1407. In the core processing section 1407, the display
section 1408 displays the input AV stream 1405 and
metadata 1406.

[0155] As described above, according to Embodiment
6, it is possible to make processing for a segment
comprising part of a data stream variable by providing an
extraction section 1403 for separating and extracting an
AV stream and metadata, an access section 1404 for
reading and writing an AV stream and metadata in a
storage section 108, and a display section 1408, which
is a core processing section 1407.
[0156] Moreover, since Embodiment 6, unlike Embodiment 2,
omits synchronization processing, when
synchronization of an AV stream and metadata is not 
necessary, processing speed can be increased by omit- 
ting synchronization processing and the configuration 
can be simplified. 

[0157] Embodiment 6 has been described as having 
a configuration in which the synchronization section 807 
is omitted from Embodiment 2, but a configuration may 
also be used in which the synchronization section 807 
is omitted from Embodiment 3 or 4. 
[0158] In Embodiment 1 to Embodiment 6, each 
processing section is configured by having all or part of 
the respective operations stored as a program (soft- 
ware) on a computer-readable storage medium such as 
a CD-ROM or DVD, and having the operations of each 
processing section performed by the CPU of a compu- 
ter, or the like, by having a computer read the program. 
[0159] A mode is also possible whereby all or part of
the operations of each processing section are stored on 
a storage medium on communication means such as 
the Internet or the like as a program (software), the pro- 
gram is downloaded to an information terminal via the 
Internet or the like, and the operations of each process- 
ing section are performed by the information terminal. 
[0160] A mode is also possible whereby each 
processing section is configured using dedicated hard- 
ware. 

[0161] In Embodiment 1 to Embodiment 6, descrip- 
tions have used an AV stream as a content data stream 
with timewise continuity, but the same kind of effects as
in the above-described embodiments can be obtained 
with not an AV stream but another stream, file, or small- 
volume information, as long as its use as a stream is 
considered useful. 

[0162] In Embodiment 1 to Embodiment 6, metadata 
definitions and MPU definitions are performed using 
DTD of XML, but XML RDF or XML Schema may be 
used, or other definition means may also be used. 
[0163] In Embodiment 1 to Embodiment 6, packetiza- 
tion has been described with MPEG-2 system PES 
packets, but an MPEG-1 system, MPEG-4, SMPTE An- 
cillary Data Packet, or another transmission format, 
streaming format or file format may also be used. 
[0164] In Embodiment 1 to Embodiment 6, private
PES has been used for the description of the
transmission layer for sending metadata, but metadata PES,
MPEG-7 PES, MPEG-2 PSI (Program Specific
Information) Section (so-called carousel) promised for the future
may also be used as a transmission layer.
[0165] In Embodiment 1 to Embodiment 4, as a
synchronization variation, one MPU may also be inserted
repeatedly to enable the necessary data to be received
when starting reception midway.
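
Repeated insertion of one MPU can be illustrated with a short Python sketch. The functions `insert_repeats` and `first_mpu_after`, the parameter `every`, and the use of plain strings for packets are all assumptions made for this example; the specification does not state a repetition interval.

```python
from typing import List


def insert_repeats(av_packets: List[str], mpu_packet: str, every: int) -> List[str]:
    """Emit the MPU packet at the start and again before every `every`-th AV packet."""
    if every <= 0:
        raise ValueError("every must be positive")
    out: List[str] = []
    for index, packet in enumerate(av_packets):
        if index % every == 0:
            out.append(mpu_packet)   # repeated copy of the same metadata unit
        out.append(packet)
    return out


def first_mpu_after(stream: List[str], join_at: int, mpu_packet: str) -> int:
    """Index of the first MPU copy seen by a receiver joining at `join_at`, or -1."""
    for index in range(join_at, len(stream)):
        if stream[index] == mpu_packet:
            return index
    return -1


repeated = insert_repeats(["a0", "a1", "a2", "a3"], "M", 2)
next_copy = first_mpu_after(repeated, 1, "M")
```

A smaller `every` shortens the wait for a receiver that joins midway, at the cost of more repeated metadata in the stream.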
[0166] In Embodiment 1 to Embodiment 6, the
network 105 or 1505 may be a terrestrial broadcasting
network, a satellite broadcasting network, a cable television
network, a line switching network, a packet switching
network, an ATM, the Internet, or another network,
package medium, hard disk, memory, or the like.

[0167] This application is based on the Japanese
Patent Application No. HEI 11-200095 filed on July 14,
1999, the entire content of which is expressly incorporated
by reference herein.

Industrial Applicability

[0168] As described above, according to the present
invention, firstly, partial execution of metadata is made
possible, and it is possible to carry out program
distribution for processing a segment comprising part of an
AV stream, speeding up of response times, reduction of
the necessary storage capacity, and reduction of
network traffic, by reconfiguring metadata unit by unit and
capsulizing it with an AV stream; secondly, close
synchronization between metadata and AV stream
processing times can be performed by making processing of a
segment comprising part of an AV stream variable; and
thirdly, it is possible to extend the degree of freedom for
designing metadata for processing an AV stream, and
to use a structured description written in XML, etc.,
directly as metadata, by using a structured description by
means of XML for metadata and metadata units, and
performing structured description re-format from
metadata to units and from units to metadata.


Claims 

1. An information provision apparatus comprising:

a data stream generation source which generates
a data stream of content that has timewise
continuity;

a metadata generation source which generates
metadata which is data that describes said data
stream content and that is unitized in
correspondence to a segment of said data stream;
and

a capsulization section which capsulizes said
data stream packets and said metadata unit
packets and generates a capsulized stream.

2. The information provision apparatus according to
claim 1, wherein said metadata unit packet is placed
so that processing of said metadata unit is completed
before the processing start time of a corresponding
segment of said data stream.






3. The information provision apparatus according to 
claim 1, wherein said metadata packet includes the
processing start time of the first packet of said cor- 
responding segment of said data stream, and the 
duration of that segment. 

4. The information provision apparatus according to 
claim 1, wherein said metadata is described by 
structured description. 

5. The information provision apparatus according to 
claim 1, wherein said metadata unit is described by
structured description. 

6. The information provision apparatus according to 
claim 4, wherein said structured description is de- 
fined by means of DTD of XML. 

7. The information provision apparatus according to 
claim 4, wherein said structured description is de- 
fined by means of RDF of XML. 

8. The information provision apparatus according to 
claim 4, wherein said structured description is de- 
fined by means of XML Schema. 

9. The information provision apparatus according to 
claim 5, wherein said structured description is de- 
fined by means of DTD of XML. 

10. The information provision apparatus according to 
claim 5, wherein said structured description is de- 
fined by means of RDF of XML. 

11. The information provision apparatus according to 
claim 5, wherein said structured description is de- 
fined by means of XML Schema. 

12. An information provision apparatus comprising: 

a data stream generation source which gener- 
ates a data stream of content that has timewise 
continuity; 

a metadata generation source which generates 
metadata which is data that relates to said data 
stream content and that is unitized in corre- 
spondence to a segment of said data stream; 
and 

a capsulization section which capsulizes said
data stream packets and said metadata unit
packets and generates a capsulized stream.

13. An information provision apparatus comprising:

a data stream generation source which generates
a data stream of content that has timewise
continuity;

a metadata generation source which generates
metadata which is data that describes said data
stream content and that is unitized in correspondence
to a segment of said data stream;

a synchronization section which synchronizes
said data stream segment and its corresponding
said metadata unit; and

a capsulization section which capsulizes post-synchronization
data stream packets and
metadata unit packets and generates a capsulized
stream.

14. An information receiving apparatus comprising:

an extraction section which extracts a content
data stream and metadata that describes that
content from a capsulized stream; and

a processing section which processes unit by
unit said metadata that has been unitized in correspondence
to a segment of said data stream.


15. The information receiving apparatus according to 
claim 14, wherein said units are merged in accord- 
ance with restriction information for merging said 
metadata units. 


16. The information receiving apparatus according to 
claim 14, wherein said processing section displays 
said metadata. 

17. The information receiving apparatus according to
claim 14, wherein said processing section converts 
said data stream in accordance with conversion 
processing defined by said metadata. 

18. The information receiving apparatus according to
claim 14, wherein said processing section capsulizes
data stream packets and metadata unit packets
and transfers capsulized said data stream packets
and capsulized metadata unit packets to another
node.

19. The information receiving apparatus according to
claim 14, wherein said processing section collects
together a plurality of metadata, and processes a
plurality of said metadata together.

20. An information receiving apparatus comprising:

an extraction section which extracts a content
data stream and metadata that describes that
content from a capsulized stream;

a synchronization section which synchronizes
unit by unit said metadata unitized in correspondence
to a segment of said data stream
with said content data stream and its corresponding
metadata unit; and

a processing section which processes synchronized
metadata unit by unit.






21. The information receiving apparatus according to 
claim 20, wherein said synchronization section syn- 
chronizes said data stream segment and its corre- 
sponding said metadata unit stored in a storage 
section. 

22. A storage medium that can be read by a computer, 
and that stores an information provision program 
that reads a data stream of content that has time- 
wise continuity and metadata which is data that de- 
scribes said data stream content and that is unitized 
in correspondence to a segment of said data 
stream, and synchronizes said data stream seg- 
ment and its corresponding said metadata unit for 
generating a capsulized data stream. 

23. The storage medium according to claim 22, wherein 
a program is stored for placing said metadata unit 
packet so that processing of said metadata unit is 
completed before the processing start time of a cor- 
responding segment of said data stream. 

24. The storage medium according to claim 22, wherein 
said metadata is described by structured descrip- 
tion. 

25. The storage medium according to claim 22, wherein 
said metadata unit is described by structured de- 
scription. 

26. An information communication system comprising: 

an information provision apparatus that has a 
data stream generation source which gener- 
ates a data stream of content that has timewise 
continuity, a metadata generation source which 
generates metadata which is data that de- 
scribes said data stream content and that is uni- 
tized in correspondence to a segment of said 
data stream, and a capsulization section which 
capsulizes said data stream packets and said 
metadata unit packets and generates a cap- 
sulized stream; and 

an information receiving apparatus that has an 
extraction section which extracts a content data 
stream and metadata that describes that con- 
tent from said capsulized stream generated by 
said information provision apparatus, and a 
processing section which processes unit by unit 
said metadata that has been unitized in corre- 
spondence to a segment of said data stream 
and said content data stream and its corre- 
sponding metadata unit. 

27. An information communication system comprising:

an information provision apparatus that has a
data stream generation source which generates
a data stream of content that has timewise
continuity, a metadata generation source which
generates metadata which is data that describes
said data stream content and that is unitized
in correspondence to a segment of said
data stream, a synchronization section which
synchronizes said data stream segment and its
corresponding said metadata unit, and a capsulization
section which capsulizes said data
stream packets and said metadata unit packets
and generates a capsulized stream; and

an information receiving apparatus that has an
extraction section which extracts a content data
stream and metadata that describes that
content from said capsulized stream generated
by said information provision apparatus, a synchronization
section which synchronizes unit
by unit said metadata unitized in correspondence
to a segment of said data stream with said
content data stream and its corresponding
metadata unit, and a processing section which
processes synchronized metadata unit by unit.

28. An information provision method comprising:

generating a segment of a data stream of content
that has timewise continuity and metadata
which is data that describes said data stream
content and that is unitized in correspondence
to a segment of said data stream; and

capsulizing said data stream packets and said
metadata unit packets and generating a capsulized
stream.

29. An information provision method comprising:

synchronizing a segment of a data stream of
content that has timewise continuity and a unit
of metadata which is data that describes said
data stream content and that is unitized in correspondence
to a segment of said data stream;
and

capsulizing post-synchronization data stream
packets and metadata unit packets and generating
a capsulized stream.

30. An information receiving method comprising:

extracting a content data stream and metadata
that describes that content from a capsulized
stream; and

processing unit by unit said metadata that has
been unitized in correspondence to a segment
of said data stream and said data stream.


31. An information receiving method comprising:

extracting a content data stream and metadata
that describes that content from a capsulized
stream;

synchronizing unit by unit said metadata unitized
in correspondence to a segment of said
data stream with said content data stream and
its corresponding metadata unit; and

processing synchronized metadata unit by unit.